Your next 45 minutes on the graveyard shift this lovely Saturday morning:


• A bit of history 

• How do we perceive elevated sound? 

• Why include height at all? 

• How do different methods (re-)produce height? 

• A closer look at multichannel stereo techniques 

• VBAP 

• Ambisonics 

A bit of History


Mostly discrete routing, a bit of 
amplitude panning. 

Lots of fun with acoustics. 

A bit of History


François Bayle's Acousmonium (Radio France):
80 different speakers spread around, for live 
diffusion of tape music (usually stereophonic). 



The music is being diffused in 
real-time, usually by the 
composer, sitting at a mixing 
desk designed for the purpose. 

Speakers are chosen for their 
different characteristics. 
Localisation is part of the 
interpretation, but not 
independent of speaker 
positions. 

A modern-day successor is the BEAST
(Birmingham ElectroAcoustic Sound Theatre).


The Why and How of With-Height Surround Sound - Jorn Nettingsmeier <nettings@stackingdwarves.net> 
Linux Audio Conference 2012, CCRMA, Stanford University


A bit of History

Systems like these are not aiming 
at a systematic, portable approach 
to with-height surround. 

They are part of the artwork, and 
of the creative process. 


Their strengths can be exploited in 


Their deficiencies mark 
important artistic constraints, 
which are either fought against, 
or put to use. 

In any case, they are integral 
parts of the artwork, too. 



A bit of History



That is of course a brilliant 
excuse :-D 


Let's look instead at systems that 


• aim for widespread deployment in a wider potential market 

• aim to reproduce content by third parties 

• define clearly how the system should be implemented 

A bit of History


• Michael Gerzon, 1973: periphonic (i.e. with-height)
surround sound using 4 channels:

B-format. 

• loudspeaker layout agnostic 

• scalable 


In 1992, Gerzon proposed this as a candidate 
format for HDTV. Alas, ... 

A bit of History


• Tomlinson Holman, 1999: eight speakers on the 
horizontal plane (with heavy frontal bias), two 
subs left and right, and two elevated frontal 
speakers: 10.2 


• speaker feed mixing 
(“Twice as good as 5.1”) 

A bit of History


• Werner Dabringhaus, 1999: front left/right, rear 
left/right, elevated front left/right: 2+2+2 

• stereo-pairwise mixing using traditional miking 
techniques 

Designed to work on DVD-Audio, with the 5 
plus 1 channels available. Some tricks to 
ensure a meaningful (although compromised) 
image when played back over an ITU 5.1 rig. 

A bit of History


• Wilfried van Baelen (Galaxy Studios), 2005: 
an ITU 5.1 system with elevated speakers above 
L, R, Ls and Rs: Auro-3D 

• same basic idea, yet more channels 


The proposal includes some neat encoding tricks 
to funnel 10 (or more) signals into 5.1 carriers, or 
into the 8 PCM streams of a Blu-ray disc. 

A bit of History


• Kimio Hamasaki et al., 2005 (NHK): ten
horizontal channels, eight elevated channels, 
one “voice of God”, three front low channels, 
two subs: 

22.2 

Designed as a complement to the proposed 
Ultra-HDTV standard for total immersion. 
Again, more channels... 

And back in the present...


It seems there are many variations on the 
theme. 

Now let's all go pick an arbitrary pair {N.M} and 
stick our names on it. 

My own humble claim to 
fame is: 


44.4 


Eat my dust, Kimio :-D 

That is of course just a joke.

The system was used for IOSONO playback
and higher-order Ambisonics.

Learning from History


• Except for Ambisonics, all proposals share the 
same paradigms/problems 

• more and more channels without real up- and 
downwards compatibility 

• frontal bias 

• speaker-feed mixing 

• underspecified signal relationships (correlation etc.) 

Perception

How do we perceive direction?

Left/right (horizontal) cues are 

• interaural time difference ITD (at LF) 

• no head shading (perfect diffraction) 

• unambiguous phase (wavelength > 2x ear dist.) 

• interaural level difference ILD (at HF) 

• head shading 

• ambiguous phase! 
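
A rough sketch of the "wavelength > 2x ear distance" condition above: the frequency below which interaural phase stays unambiguous follows from f = c / (2 d). The ear spacing of 0.17 m and the speed of sound of 343 m/s are assumed illustrative values, not figures from the talk, and the function name is made up for this example.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 degrees C


def max_unambiguous_itd_frequency(ear_distance_m: float = 0.17) -> float:
    """Return the frequency (Hz) whose wavelength equals twice the ear distance.

    Below this frequency the interaural phase difference is unambiguous;
    above it, phase wraps around and level differences have to take over.
    The default ear distance is an assumed, typical value.
    """
    if ear_distance_m <= 0.0:
        raise ValueError("ear distance must be positive")
    return SPEED_OF_SOUND_M_S / (2.0 * ear_distance_m)


if __name__ == "__main__":
    print(f"{max_unambiguous_itd_frequency():.0f} Hz")
```

With these assumed values the crossover lands at roughly 1 kHz; a larger head moves it downwards.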

How do we perceive height?

How about a source that moves up on the 
median plane (i.e. right in front of us)? 

• constant ITD, no cue 

• constant ILD, no cue 

-> All we have is a slight change of tone colour,
due to ear flaps (pinnae) and head/torso effects.

How do we perceive height?

If it's just tone colour, how do we perceive 
height when we don't know the uncoloured 
sound? 

Short answer: we don't. 


Long answer: we do not. But some narrowband 
signals suggest height regardless of the actual 
source elevation (Blauert, 1983). 

How do we perceive height?

Humans don't perceive height very well. Signal 
semantics dominate: 

• Airplane? Must be up. Birds, likewise.

• Footsteps? Flowing water? Down.

And if you see a source, that's where you hear 
it, usually (multi-modal perception). 

How do we perceive height?

But: 

• We can move our head to direct the more acute 
horizontal localisation mechanisms at any 
source. 

• We can “explore” a sound field at leisure. 

Then why bother?

Why include height?


A few common claims, supported by personal 
experience (not research): 

• improved immersion/envelopment 

• increased robustness against listening room 
problems 

• enlarged usable listening area 

• more natural timbre 

• height localisation 

Nobody cares!






Uses for height localisation 


• better audibility of complex structures due to 
vertical separation: e.g. organ music 

• more precise reproduction of room acoustics: 
characteristic ceiling reflections 

• use of location as a precisely audible musical 
parameter, like pitch and duration 

• discrete sources at height: elevated choirs or 
solo instruments, opera scenes 

Height reproduction in Stereo


• Stereo := using stereophonic techniques 

• level differences in speaker pairs (=artificial ILD) 

• time differences in speaker pairs (=artificial ITD) 


But: not used on the median plane. 

Tone colour for any given height is not the sum of 
upper speaker tone colour plus lower speaker 
tone colour weighted by relative amplitude. 
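
The level-difference technique listed above can be sketched as a constant-power pan between a lower and an upper speaker. This is a generic sine/cosine pan law chosen for illustration, not a method taken from the talk, and the function names are made up. As the slide points out, such gains only set the level ratio; they do not produce the tone colour of a real source at an intermediate height.

```python
import math
from typing import Tuple


def constant_power_pair_gains(position: float) -> Tuple[float, float]:
    """Return (lower_gain, upper_gain) for a vertical speaker pair.

    position = 0.0 feeds only the lower speaker, 1.0 only the upper one.
    The squared gains always sum to 1, so total power stays constant.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie in [0, 1]")
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)


def level_difference_db(lower_gain: float, upper_gain: float) -> float:
    """Artificial level difference (upper re. lower) in dB."""
    if lower_gain == 0.0:
        return math.inf
    if upper_gain == 0.0:
        return -math.inf
    return 20.0 * math.log10(upper_gain / lower_gain)


if __name__ == "__main__":
    for pos in (0.0, 0.25, 0.5, 0.75, 1.0):
        lo_g, up_g = constant_power_pair_gains(pos)
        print(pos, round(lo_g, 3), round(up_g, 3))
```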

Height reproduction in Stereo


Hence: ILD/ITD not much use for height, 
steep localisation curve. 

Bottom line: the sound is either at the lower speaker
or at the upper speaker.

No stable auditory events in between (however, 
suggesting quick vertical movement is 
possible). 

Height reproduction in Stereo


Artificially delivered ITD/ILD fall apart when the 
listener's head is rotated away from the frontal 
upright orientation. 

Don't move! 

Height reproduction in Ambisonics


Ambi attempts to get the soundfield correct, to 
some degree. 

In a correct soundfield, you can move any way 
you like and collect useful cues. 

Once your brain has locked onto a cue, 
localisation remains stable even if you move. 
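
To make the soundfield idea concrete, here is a minimal sketch of encoding one mono sample into first-order B-format (the four-channel format from the history section). It assumes the traditional Furse-Malham weighting (W scaled by 1/sqrt(2)) and an azimuth measured counter-clockwise from straight ahead; other conventions exist, and the function name is made up for this example.

```python
import math
from typing import Tuple


def encode_b_format(sample: float, azimuth_deg: float,
                    elevation_deg: float) -> Tuple[float, float, float, float]:
    """Encode a mono sample into first-order B-format (W, X, Y, Z).

    Assumed convention: Furse-Malham weights, azimuth counter-clockwise
    from the front, elevation upwards from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2.0)               # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front/back
    y = sample * math.sin(az) * math.cos(el)  # left/right
    z = sample * math.sin(el)                 # up/down: carries the height
    return w, x, y, z


if __name__ == "__main__":
    # A source straight ahead, rising from 0 to 90 degrees elevation.
    for elevation in (0.0, 30.0, 60.0, 90.0):
        print(elevation, [round(c, 3) for c in encode_b_format(1.0, 0.0, elevation)])
```

Elevation ends up in the Z channel regardless of any speaker layout; a decoder for a given rig is a separate step.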

Bottom line:


• Only Higher-order Ambisonics and VBAP can 
create meaningful and stable auditory events at 
continuously variable elevation. 
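
For VBAP, a minimal sketch of the gain computation for a single loudspeaker triplet: the source direction is written as a weighted sum of the three speaker direction vectors, and the weights are normalised to constant power. The speaker positions in the usage example are invented, the helper names are made up, and a full implementation would also have to pick the right triplet for each source.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def unit_vector(azimuth_deg: float, elevation_deg: float) -> Vec3:
    """Direction vector; azimuth counter-clockwise from front (assumed)."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(az) * math.cos(el),
            math.sin(az) * math.cos(el),
            math.sin(el))


def det3(a: Vec3, b: Vec3, c: Vec3) -> float:
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def vbap_triplet_gains(source: Vec3, speakers: Sequence[Vec3]) -> Vec3:
    """Solve source = g1*l1 + g2*l2 + g3*l3 (Cramer's rule), then
    scale the gains so that their squares sum to 1."""
    l1, l2, l3 = speakers
    d = det3(l1, l2, l3)
    if abs(d) < 1e-9:
        raise ValueError("speaker directions do not span 3D space")
    raw = (det3(source, l2, l3) / d,
           det3(l1, source, l3) / d,
           det3(l1, l2, source) / d)
    if min(raw) < -1e-9:
        raise ValueError("source lies outside this speaker triplet")
    norm = math.sqrt(sum(g * g for g in raw))
    return (raw[0] / norm, raw[1] / norm, raw[2] / norm)


if __name__ == "__main__":
    # Hypothetical triplet: front left, front right, elevated centre.
    triplet = [unit_vector(30, 0), unit_vector(-30, 0), unit_vector(0, 45)]
    print(vbap_triplet_gains(unit_vector(0, 20), triplet))
```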

Is with-height surround really worth the trouble?

Depends.

Thanks for your attention.

I'm looking forward to your remarks and questions.